From the TED Talk by Joseph Redmon: How computers learn to recognize objects instantly
Unscramble the Blue Letters
So in just a few yaers, we've gone from 20 seconds per image to 20 milliseconds per image, a tsunaohd times feastr. How did we get there? Well, in the past, object dittcoeen systems would take an image like this and slpit it into a bunch of regions and then run a classifier on each of these regions, and high scores for that classifier would be considered detections in the image. But this involved running a cialsifesr thousands of times over an image, thousands of neural network evaluations to produce detection. Instead, we trained a single network to do all of detection for us. It produces all of the bounding boxes and class probabilities simultaneously. With our sstyem, instead of looking at an igmae thousands of times to produce detection, you only look once, and that's why we call it the YOLO metohd of object detection. So with this speed, we're not just limited to images; we can process video in real time. And now, instead of just seeing that cat and dog, we can see them move around and iercantt with each other.
Open Cloze
So in just a few _____, we've gone from 20 seconds per image to 20 milliseconds per image, a ________ times ______. How did we get there? Well, in the past, object _________ systems would take an image like this and _____ it into a bunch of regions and then run a classifier on each of these regions, and high scores for that classifier would be considered detections in the image. But this involved running a __________ thousands of times over an image, thousands of neural network evaluations to produce detection. Instead, we trained a single network to do all of detection for us. It produces all of the bounding boxes and class probabilities simultaneously. With our ______, instead of looking at an _____ thousands of times to produce detection, you only look once, and that's why we call it the YOLO ______ of object detection. So with this speed, we're not just limited to images; we can process video in real time. And now, instead of just seeing that cat and dog, we can see them move around and ________ with each other.
Solution
- faster
- detection
- method
- thousand
- years
- interact
- system
- image
- classifier
- split
Original Text
So in just a few years, we've gone from 20 seconds per image to 20 milliseconds per image, a thousand times faster. How did we get there? Well, in the past, object detection systems would take an image like this and split it into a bunch of regions and then run a classifier on each of these regions, and high scores for that classifier would be considered detections in the image. But this involved running a classifier thousands of times over an image, thousands of neural network evaluations to produce detection. Instead, we trained a single network to do all of detection for us. It produces all of the bounding boxes and class probabilities simultaneously. With our system, instead of looking at an image thousands of times to produce detection, you only look once, and that's why we call it the YOLO method of object detection. So with this speed, we're not just limited to images; we can process video in real time. And now, instead of just seeing that cat and dog, we can see them move around and interact with each other.
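The contrast described in the passage can be made concrete with a small sketch. It is not the speaker's implementation: the image size, window size, stride, and the `toy_classifier` function are all assumptions chosen for illustration. It only shows why the older approach needs thousands of classifier evaluations per image, and checks the "thousand times faster" arithmetic.

```python
from typing import Callable, List, Tuple

Box = Tuple[int, int, int, int]  # (x, y, width, height) in pixels


def split_into_regions(width: int, height: int, size: int, stride: int) -> List[Box]:
    """Split an image into overlapping square regions (a simple sliding window)."""
    return [
        (x, y, size, size)
        for y in range(0, height - size + 1, stride)
        for x in range(0, width - size + 1, stride)
    ]


def detect_by_regions(
    regions: List[Box],
    classify: Callable[[Box], float],
    threshold: float,
) -> Tuple[List[Tuple[Box, float]], int]:
    """Run the classifier once per region; high scores count as detections.

    Returns the detections and the number of classifier evaluations used.
    """
    detections = []
    evaluations = 0
    for box in regions:
        score = classify(box)
        evaluations += 1
        if score >= threshold:
            detections.append((box, score))
    return detections, evaluations


def speedup_factor(old_ms: float, new_ms: float) -> float:
    """How many times faster the new per-image time is than the old one."""
    return old_ms / new_ms


def toy_classifier(box: Box) -> float:
    """Hypothetical stand-in for a neural network: one region scores high."""
    x, y, _, _ = box
    return 0.9 if (x, y) == (64, 64) else 0.1


if __name__ == "__main__":
    # Assumed values: a 448x448 image, 64-pixel windows, 8-pixel stride.
    regions = split_into_regions(448, 448, size=64, stride=8)
    detections, evaluations = detect_by_regions(regions, toy_classifier, threshold=0.5)
    print(f"classifier evaluations for one image: {evaluations}")
    print(f"detections: {detections}")
    # 20 seconds = 20,000 milliseconds, compared with 20 milliseconds.
    print(f"speedup: {speedup_factor(20_000, 20)}x")
```

A single-network detector in the sense of the passage would replace the loop in `detect_by_regions` with one evaluation that returns all boxes and class probabilities at once, which is where the evaluation count drops from thousands to one.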
Frequently Occurring Word Combinations
ngrams of length 2
| collocation | frequency |
| --- | --- |
| computer vision | 5 |
| object detection | 4 |
| real time | 3 |
| neural network | 2 |
| bounding boxes | 2 |
| times faster | 2 |
| detection system | 2 |
| stop signs | 2 |
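A table like this can be produced by counting adjacent word pairs. The sketch below is one minimal way to do that and is not necessarily how this page computed its counts: the tokenization rule and the sample sentence are assumptions, and no stop-word filtering is applied, so a real run over the full talk would also surface pairs such as "of the".

```python
import re
from collections import Counter
from typing import Tuple


def bigram_counts(text: str) -> "Counter[Tuple[str, str]]":
    """Count adjacent word pairs (ngrams of length 2) in lowercased text."""
    # Assumed tokenization: runs of letters and apostrophes are words.
    words = re.findall(r"[a-z']+", text.lower())
    return Counter(zip(words, words[1:]))


if __name__ == "__main__":
    # Hypothetical sample sentence, used only to show the counting.
    sample = "Object detection is fast, and object detection is fun."
    for (first, second), count in bigram_counts(sample).most_common(3):
        print(f"{first} {second} | {count}")
```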
Important Words
- bounding
- boxes
- bunch
- call
- cat
- class
- classifier
- considered
- detection
- detections
- dog
- evaluations
- faster
- high
- image
- interact
- involved
- limited
- method
- milliseconds
- move
- network
- neural
- object
- probabilities
- process
- produce
- produces
- real
- regions
- run
- running
- scores
- seconds
- simultaneously
- single
- speed
- split
- system
- systems
- thousand
- thousands
- time
- times
- trained
- video
- years
- yolo